Field mapping docs#9

Open
conradbzura wants to merge 3 commits into master from field-mapping-docs

@conradbzura
Collaborator

Added module-level docstrings to each consortium's service module that document the field mappings for its scraped data.

Document the metadata URL and every TSV column mapped in
transform_to_c2m2, grouped by document level (File, Collection,
Biosample, Subject, DCC) with enriched subsections for extra fields.

Document the Search API URLs, entity matching patterns, and every API
field mapped in the file and collection enrichment passes, grouped by
document level with enriched subsections for DCC-specific fields.

Document the Search API URLs, entity matching strategy, and every API
field mapped across the file, collection, and subject enrichment passes,
grouped by document level with enriched subsections for DCC-specific
fields.
group_name → collections[].extra.hubmap.group_name
visualization → collections[].extra.hubmap.visualization
vitessce-hints → collections[].extra.hubmap.vitessce_hints
metadata → collections[].extra.hubmap.metadata
Collaborator Author

We don't populate the promoted file_type_detailed field for HuBMAP, but this information seems like it can help with visualization.
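
For reference, a minimal sketch of the mapping in the docstring lines above, assuming the scraped HuBMAP record is a plain dict and the target is a nested `extra.hubmap` dict on a collection document. The names `HUBMAP_EXTRA_KEYS` and `build_hubmap_extra` and the sample record are hypothetical, not code from this PR.

```python
# Hypothetical sketch, not the code in this PR.
# Source keys (left) and target keys (right) follow the docstring lines above.
HUBMAP_EXTRA_KEYS = {
    "group_name": "group_name",
    "visualization": "visualization",
    "vitessce-hints": "vitessce_hints",  # hyphen becomes underscore
    "metadata": "metadata",
}


def build_hubmap_extra(record: dict) -> dict:
    """Copy the mapped keys that are present in `record` into extra.hubmap."""
    hubmap = {
        target: record[source]
        for source, target in HUBMAP_EXTRA_KEYS.items()
        if source in record
    }
    return {"extra": {"hubmap": hubmap}}


# Made-up sample record, for illustration only.
sample = {"group_name": "Example Group", "vitessce-hints": ["hint_a", "hint_b"]}
collection = build_hubmap_extra(sample)
print(collection)
```

Keeping the key table separate from the copy logic means the docstring and the mapping can be compared line by line.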

@conradbzura requested a review from nvictus on February 20, 2026, 14:58
File
~~~~
File accession → local_id
File download URL → access_url, filename (derived)
Member

This is fine. Note that there is also an s3_uri key that could serve as a substitute or fallback. It might be more performant too.
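
A rough sketch of the fallback idea, assuming each scraped file record is a dict with an optional download URL and an optional `s3_uri`. The `download_url` key name and the helper `resolve_access_url` are hypothetical; only `s3_uri`, `access_url`, and `filename` come from this thread.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def resolve_access_url(record: dict) -> dict:
    """Pick access_url from the download URL, falling back to s3_uri."""
    # "download_url" is an assumed key name for the "File download URL" column.
    url = record.get("download_url") or record.get("s3_uri")
    if not url:
        return {"access_url": None, "filename": None}
    # filename is derived from the last path segment of the chosen URL.
    filename = PurePosixPath(urlparse(url).path).name or None
    return {"access_url": url, "filename": filename}


# Example URLs are placeholders, not real file locations.
print(resolve_access_url({"download_url": "https://example.org/files/F1/F1.bam"}))
print(resolve_access_url({"s3_uri": "s3://example-bucket/F1/F1.bam"}))
```

Swapping the order of the two `get` calls would make `s3_uri` the primary source instead of the fallback.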

Member

Just noticed that they provide Azure URLs too now, and you include both in enriched. Good!

Size → size_in_bytes
md5sum → md5
File Status → status
Experiment date released → creation_time
Member

Note: check whether the "release" date in 4DN also maps to this key.
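
The file-level mappings quoted in this thread could be expressed as a simple column-to-field table. This is a sketch only: `FILE_COLUMN_MAP` and `map_file_row` are hypothetical names, the sample row is invented, and the real `transform_to_c2m2` may cast or normalize values differently.

```python
import csv
import io

# Column names (left) and target fields (right) are taken from the docstring
# lines quoted in this review; everything else here is illustrative.
FILE_COLUMN_MAP = {
    "File accession": "local_id",
    "Size": "size_in_bytes",
    "md5sum": "md5",
    "File Status": "status",
    "Experiment date released": "creation_time",
}


def map_file_row(row: dict) -> dict:
    """Rename mapped TSV columns; cast Size to int when it is present."""
    doc = {
        field: row[column]
        for column, field in FILE_COLUMN_MAP.items()
        if row.get(column)  # skip missing or empty cells
    }
    if "size_in_bytes" in doc:
        doc["size_in_bytes"] = int(doc["size_in_bytes"])
    return doc


# Invented two-line TSV, for illustration only.
tsv = (
    "File accession\tSize\tmd5sum\tFile Status\tExperiment date released\n"
    "FILE001\t1024\tabc123\treleased\t2020-01-01\n"
)
rows = list(csv.DictReader(io.StringIO(tsv), delimiter="\t"))
file_doc = map_file_row(rows[0])
print(file_doc)
```

If a second source's release date should land on `creation_time` as well, it would only need its own entry in a table like this one.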
